Overview

Dataset statistics

Number of variables: 16
Number of observations: 4384
Missing cells: 2668
Missing cells (%): 3.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 369.6 KiB
Average record size in memory: 86.3 B
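
The percentage and per-record figures above can be recomputed from the raw counts. A minimal Python sketch, assuming the missing-cell percentage is taken over all variables × observations cells and that 1 KiB = 1024 B (variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 4384
missing_cells = 2668
total_size_kib = 369.6

# Share of missing cells over the full variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average in-memory size of one record, in bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```

All 2668 missing cells come from a single variable (sqft_basement), which is why the dataset-level share is far smaller than that variable's own missing rate.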

Variable types

Numeric: 11
Categorical: 5

Alerts

sqft_basement is highly correlated with grade and 5 other fields (High correlation)
bathrooms is highly correlated with grade and 4 other fields (High correlation)
bedrooms is highly correlated with sqft_basement and 4 other fields (High correlation)
sqft_above is highly correlated with grade and 7 other fields (High correlation)
sqft_living15 is highly correlated with grade and 4 other fields (High correlation)
floors is highly correlated with bedrooms and 2 other fields (High correlation)
price is highly correlated with grade and 3 other fields (High correlation)
sqft_living is highly correlated with grade and 7 other fields (High correlation)
waterfront is highly correlated with view (High correlation)
sqft_lot is highly correlated with sqft_lot15 (High correlation)
sqft_lot15 is highly correlated with sqft_lot (High correlation)
view is highly correlated with waterfront (High correlation)
grade is highly correlated with sqft_basement and 5 other fields (High correlation)
sqft_basement has 2668 (60.9%) missing values (Missing)
level_0 has unique values (Unique)
df_index has unique values (Unique)
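
The report does not state which correlation coefficient or cutoff produced these alerts. As a rough sketch of the idea only, the snippet below flags column pairs whose Pearson coefficient reaches an assumed cutoff. The helpers `pearson` and `high_correlation_pairs`, the 0.8 threshold, and the toy data are all hypothetical and are not taken from pandas-profiling or from this dataset.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def high_correlation_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    The threshold is an illustrative cutoff, not the report's setting.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Toy data, made up for illustration.
toy = {
    "sqft_lot": [1, 2, 3, 4, 5],
    "sqft_lot15": [1, 2, 3, 5, 4],
    "condition": [3, 1, 4, 1, 5],
}
pairs = high_correlation_pairs(toy)
print(pairs)
```

Each flagged pair would appear twice in a report like this one, once under each variable, as with sqft_lot and sqft_lot15 above.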

Reproduction

Analysis started: 2022-09-23 03:35:23.200510
Analysis finished: 2022-09-23 03:35:44.997995
Duration: 21.8 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
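
The reported duration is the difference between the two timestamps. A quick check using only Python's standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2022-09-23 03:35:23.200510")
finished = datetime.fromisoformat("2022-09-23 03:35:44.997995")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```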

Variables

level_0
Real number (ℝ≥0)

UNIQUE

Distinct: 4384
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18655.07596
Minimum: 14
Maximum: 111906
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 14
5-th percentile: 1162.6
Q1: 6274
Median: 14059
Q3: 26827.25
95-th percentile: 52819.9
Maximum: 111906
Range: 111892
Interquartile range (IQR): 20553.25

Descriptive statistics

Standard deviation: 16323.09487
Coefficient of variation (CV): 0.8749948224
Kurtosis: 1.795466893
Mean: 18655.07596
Median Absolute Deviation (MAD): 9398
Skewness: 1.336983306
Sum: 81783853
Variance: 266443426.3
Monotonicity: Not monotonic
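
Several of the level_0 figures are derived from others in the same block. A short check, using values copied from the statistics above (variable names are illustrative):

```python
from math import isclose

# Values copied from the level_0 statistics.
minimum, maximum = 14, 111906
q1, q3 = 6274, 26827.25
mean, std = 18655.07596, 16323.09487

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

# Compare against the reported figures (inputs are rounded, so allow slack).
print(value_range, iqr)
print(isclose(cv, 0.8749948224, rel_tol=1e-7))
print(isclose(variance, 266443426.3, rel_tol=1e-7))
```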
Histogram with fixed size bins (bins=50)

Most common values:
Value | Count | Frequency (%)
32828 | 1 | < 0.1%
3563 | 1 | < 0.1%
3597 | 1 | < 0.1%
19312 | 1 | < 0.1%
27672 | 1 | < 0.1%
26790 | 1 | < 0.1%
48844 | 1 | < 0.1%
26971 | 1 | < 0.1%
10676 | 1 | < 0.1%
10885 | 1 | < 0.1%
Other values (4374) | 4374 | 99.8%

Smallest 10 values:
Value | Count | Frequency (%)
14 | 1 | < 0.1%
15 | 1 | < 0.1%
18 | 1 | < 0.1%
20 | 1 | < 0.1%
27 | 1 | < 0.1%
29 | 1 | < 0.1%
30 | 1 | < 0.1%
31 | 1 | < 0.1%
41 | 1 | < 0.1%
42 | 1 | < 0.1%

Largest 10 values:
Value | Count | Frequency (%)
111906 | 1 | < 0.1%
93792 | 1 | < 0.1%
91932 | 1 | < 0.1%
91673 | 1 | < 0.1%
85147 | 1 | < 0.1%
85126 | 1 | < 0.1%
83869 | 1 | < 0.1%
83788 | 1 | < 0.1%
83746 | 1 | < 0.1%
82990 | 1 | < 0.1%
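
The histogram captions in this report read "Histogram with fixed size bins". A minimal sketch of equal-width binning, assuming the bins span the observed minimum to maximum. `fixed_width_histogram` is a hypothetical helper, not the routine the report uses, and the input is the ten smallest level_0 values listed above:

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("need at least two distinct values")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lands exactly on the last edge, so clamp it
        # into the final bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


edges, counts = fixed_width_histogram(
    [14, 15, 18, 20, 27, 29, 30, 31, 41, 42], bins=4
)
print(edges)
print(counts)
```

The bin counts in the captions (50 for most variables, but 11 for grade, 25 for bathrooms, 9 for bedrooms, and 6 for floors) equal the number of distinct values whenever that number is below 50, which is consistent with a cap of 50 bins.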

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 4384
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10927.9391
Minimum: 1
Maximum: 21985
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1212.75
Q1: 5473.5
Median: 10775.5
Q3: 16344.75
95-th percentile: 20852.85
Maximum: 21985
Range: 21984
Interquartile range (IQR): 10871.25

Descriptive statistics

Standard deviation: 6280.36
Coefficient of variation (CV): 0.5747067168
Kurtosis: -1.189728993
Mean: 10927.9391
Median Absolute Deviation (MAD): 5441
Skewness: 0.03233114678
Sum: 47908085
Variance: 39442921.72
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most common values:
Value | Count | Frequency (%)
6159 | 1 | < 0.1%
9779 | 1 | < 0.1%
20601 | 1 | < 0.1%
11440 | 1 | < 0.1%
12820 | 1 | < 0.1%
5804 | 1 | < 0.1%
15943 | 1 | < 0.1%
1665 | 1 | < 0.1%
20723 | 1 | < 0.1%
201 | 1 | < 0.1%
Other values (4374) | 4374 | 99.8%

Smallest 10 values:
Value | Count | Frequency (%)
1 | 1 | < 0.1%
19 | 1 | < 0.1%
23 | 1 | < 0.1%
28 | 1 | < 0.1%
41 | 1 | < 0.1%
43 | 1 | < 0.1%
58 | 1 | < 0.1%
60 | 1 | < 0.1%
61 | 1 | < 0.1%
67 | 1 | < 0.1%

Largest 10 values:
Value | Count | Frequency (%)
21985 | 1 | < 0.1%
21976 | 1 | < 0.1%
21971 | 1 | < 0.1%
21968 | 1 | < 0.1%
21965 | 1 | < 0.1%
21964 | 1 | < 0.1%
21963 | 1 | < 0.1%
21951 | 1 | < 0.1%
21947 | 1 | < 0.1%
21944 | 1 | < 0.1%
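
The Monotonicity entry describes whether the values, taken in row order, never decrease or never increase. A minimal sketch of such a check follows. The function name, the two "Monotonically ..." labels, and the sample lists are illustrative assumptions; only the label "Not monotonic" appears in this report.

```python
def monotonicity(values):
    """Classify a sequence as increasing, decreasing, or not monotonic."""
    pairs = list(zip(values, values[1:]))
    if all(a <= b for a, b in pairs):
        return "Monotonically increasing"
    if all(a >= b for a, b in pairs):
        return "Monotonically decreasing"
    return "Not monotonic"


# Illustrative inputs, not the dataset's actual row order.
print(monotonicity([1, 19, 23, 28]))
print(monotonicity([6159, 9779, 201, 11440]))
```

An index-like column that reports "Not monotonic", as df_index does here, indicates that the rows are no longer in their original index order.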

grade
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.65419708
Minimum: 3
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 3
5-th percentile: 6
Q1: 7
Median: 7
Q3: 8
95-th percentile: 10
Maximum: 13
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.178149492
Coefficient of variation (CV): 0.153922022
Kurtosis: 1.079601273
Mean: 7.65419708
Median Absolute Deviation (MAD): 1
Skewness: 0.7624781624
Sum: 33556
Variance: 1.388036225
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
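
Most common values:
Value | Count | Frequency (%)
7 | 1828 | 41.7%
8 | 1228 | 28.0%
9 | 508 | 11.6%
6 | 407 | 9.3%
10 | 254 | 5.8%
11 | 76 | 1.7%
5 | 60 | 1.4%
12 | 16 | 0.4%
13 | 3 | 0.1%
4 | 3 | 0.1%

Smallest 10 values:
Value | Count | Frequency (%)
3 | 1 | < 0.1%
4 | 3 | 0.1%
5 | 60 | 1.4%
6 | 407 | 9.3%
7 | 1828 | 41.7%
8 | 1228 | 28.0%
9 | 508 | 11.6%
10 | 254 | 5.8%
11 | 76 | 1.7%
12 | 16 | 0.4%

Largest 10 values:
Value | Count | Frequency (%)
13 | 3 | 0.1%
12 | 16 | 0.4%
11 | 76 | 1.7%
10 | 254 | 5.8%
9 | 508 | 11.6%
8 | 1228 | 28.0%
7 | 1828 | 41.7%
6 | 407 | 9.3%
5 | 60 | 1.4%
4 | 3 | 0.1%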
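
Because grade has only 11 distinct values, its frequency tables are complete and the mean, sum, and variance can be recomputed from them. A sketch, assuming the reported variance is the sample variance (divisor n - 1), which is the convention that reproduces the reported figure:

```python
# Full frequency table for grade (value -> count), copied from the report.
grade_counts = {3: 1, 4: 3, 5: 60, 6: 407, 7: 1828, 8: 1228,
                9: 508, 10: 254, 11: 76, 12: 16, 13: 3}

n = sum(grade_counts.values())
total = sum(value * count for value, count in grade_counts.items())
mean = total / n

# Sample variance: squared deviations from the mean, divided by n - 1.
sum_sq_dev = sum(count * (value - mean) ** 2
                 for value, count in grade_counts.items())
variance = sum_sq_dev / (n - 1)

print(n, total)
print(f"{mean:.8f} {variance:.9f}")
```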

sqft_basement
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 202
Distinct (%): 11.8%
Missing: 2668
Missing (%): 60.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 741.6491841
Minimum: 10
Maximum: 3480
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 10
5-th percentile: 190
Q1: 450
Median: 700
Q3: 970
95-th percentile: 1452.5
Maximum: 3480
Range: 3470
Interquartile range (IQR): 520

Descriptive statistics

Standard deviation: 399.0579306
Coefficient of variation (CV): 0.53806832
Kurtosis: 2.657961737
Mean: 741.6491841
Median Absolute Deviation (MAD): 260
Skewness: 1.041309076
Sum: 1272670
Variance: 159247.232
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
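
Most common values:
Value | Count | Frequency (%)
700 | 49 | 1.1%
600 | 47 | 1.1%
500 | 47 | 1.1%
400 | 36 | 0.8%
800 | 35 | 0.8%
900 | 33 | 0.8%
300 | 32 | 0.7%
1000 | 31 | 0.7%
620 | 25 | 0.6%
1100 | 22 | 0.5%
Other values (192) | 1359 | 31.0%
(Missing) | 2668 | 60.9%

Smallest 10 values:
Value | Count | Frequency (%)
10 | 1 | < 0.1%
20 | 1 | < 0.1%
40 | 1 | < 0.1%
60 | 1 | < 0.1%
70 | 3 | 0.1%
80 | 5 | 0.1%
90 | 2 | < 0.1%
100 | 9 | 0.2%
110 | 3 | 0.1%
120 | 3 | 0.1%

Largest 10 values:
Value | Count | Frequency (%)
3480 | 1 | < 0.1%
3260 | 1 | < 0.1%
2730 | 1 | < 0.1%
2330 | 1 | < 0.1%
2220 | 2 | < 0.1%
2160 | 1 | < 0.1%
2150 | 1 | < 0.1%
2100 | 1 | < 0.1%
2060 | 1 | < 0.1%
2040 | 1 | < 0.1%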
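
For a variable with missing values, the summary figures relate to each other as follows. A sketch using the sqft_basement numbers above, assuming that Distinct (%) and Mean are computed over non-missing rows only (an assumption that matches the reported values):

```python
# Figures copied from the sqft_basement summary.
n_observations = 4384
n_missing = 2668
n_distinct = 202
total = 1272670  # "Sum" of the non-missing values

n_present = n_observations - n_missing
missing_pct = 100 * n_missing / n_observations
distinct_pct = 100 * n_distinct / n_present  # relative to non-missing rows
mean = total / n_present                     # missing rows are excluded

print(n_present)
print(f"{missing_pct:.1f}% missing, {distinct_pct:.1f}% distinct")
print(f"mean {mean:.7f}")
```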

view
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 13152
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 3966 | 90.5%
2.0 | 189 | 4.3%
3.0 | 90 | 2.1%
1.0 | 72 | 1.6%
4.0 | 67 | 1.5%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot; its counts repeat the Common Values table.]

Most occurring characters

Value | Count | Frequency (%)
0 | 8350 | 63.5%
. | 4384 | 33.3%
2 | 189 | 1.4%
3 | 90 | 0.7%
1 | 72 | 0.5%
4 | 67 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8768 | 66.7%
Other Punctuation | 4384 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 8350 | 95.2%
2 | 189 | 2.2%
3 | 90 | 1.0%
1 | 72 | 0.8%
4 | 67 | 0.8%

Other Punctuation
Value | Count | Frequency (%)
. | 4384 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 13152 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 8350 | 63.5%
. | 4384 | 33.3%
2 | 189 | 1.4%
3 | 90 | 0.7%
1 | 72 | 0.5%
4 | 67 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13152 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 8350 | 63.5%
. | 4384 | 33.3%
2 | 189 | 1.4%
3 | 90 | 0.7%
1 | 72 | 0.5%
4 | 67 | 0.5%
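
The character tables follow directly from the category counts, because each category label (for example "0.0") contributes its characters once per row. A sketch that rebuilds the "Most occurring characters" figures for view from its Common Values table (variable names are illustrative):

```python
from collections import Counter

# Category counts for view, copied from the Common Values table.
view_counts = {"0.0": 3966, "2.0": 189, "3.0": 90, "1.0": 72, "4.0": 67}

# Each label contributes every one of its characters `count` times.
char_counts = Counter()
for label, count in view_counts.items():
    for ch in label:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
for ch, count in char_counts.most_common():
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```

The character "0" dominates because every label ends in ".0": 4384 trailing zeros plus 3966 leading zeros from the "0.0" category give 8350.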

bathrooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 25
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.111656022
Minimum: 0.5
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 1
Q1: 1.75
Median: 2.25
Q3: 2.5
95-th percentile: 3.5
Maximum: 8
Range: 7.5
Interquartile range (IQR): 0.75

Descriptive statistics

Standard deviation: 0.7709681102
Coefficient of variation (CV): 0.3651011823
Kurtosis: 1.625505159
Mean: 2.111656022
Median Absolute Deviation (MAD): 0.5
Skewness: 0.5490073892
Sum: 9257.5
Variance: 0.5943918269
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
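
Most common values:
Value | Count | Frequency (%)
2.5 | 1065 | 24.3%
1 | 797 | 18.2%
1.75 | 638 | 14.6%
2.25 | 423 | 9.6%
2 | 392 | 8.9%
1.5 | 268 | 6.1%
2.75 | 246 | 5.6%
3.5 | 158 | 3.6%
3 | 154 | 3.5%
3.25 | 115 | 2.6%
Other values (15) | 128 | 2.9%

Smallest 10 values:
Value | Count | Frequency (%)
0.5 | 1 | < 0.1%
0.75 | 15 | 0.3%
1 | 797 | 18.2%
1.25 | 4 | 0.1%
1.5 | 268 | 6.1%
1.75 | 638 | 14.6%
2 | 392 | 8.9%
2.25 | 423 | 9.6%
2.5 | 1065 | 24.3%
2.75 | 246 | 5.6%

Largest 10 values:
Value | Count | Frequency (%)
8 | 1 | < 0.1%
7.5 | 1 | < 0.1%
6 | 1 | < 0.1%
5.75 | 1 | < 0.1%
5.5 | 1 | < 0.1%
5.25 | 4 | 0.1%
5 | 3 | 0.1%
4.75 | 5 | 0.1%
4.5 | 18 | 0.4%
4.25 | 14 | 0.3%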

bedrooms
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.367244526
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 9
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9073855097
Coefficient of variation (CV): 0.2694741955
Kurtosis: 1.196311618
Mean: 3.367244526
Median Absolute Deviation (MAD): 1
Skewness: 0.4115450537
Sum: 14762
Variance: 0.8233484631
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
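
Most common values:
Value | Count | Frequency (%)
3 | 1942 | 44.3%
4 | 1417 | 32.3%
2 | 580 | 13.2%
5 | 339 | 7.7%
1 | 48 | 1.1%
6 | 46 | 1.0%
7 | 9 | 0.2%
9 | 2 | < 0.1%
8 | 1 | < 0.1%

Smallest values:
Value | Count | Frequency (%)
1 | 48 | 1.1%
2 | 580 | 13.2%
3 | 1942 | 44.3%
4 | 1417 | 32.3%
5 | 339 | 7.7%
6 | 46 | 1.0%
7 | 9 | 0.2%
8 | 1 | < 0.1%
9 | 2 | < 0.1%

Largest values:
Value | Count | Frequency (%)
9 | 2 | < 0.1%
8 | 1 | < 0.1%
7 | 9 | 0.2%
6 | 46 | 1.0%
5 | 339 | 7.7%
4 | 1417 | 32.3%
3 | 1942 | 44.3%
2 | 580 | 13.2%
1 | 48 | 1.1%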
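
The median and the Median Absolute Deviation (MAD) for bedrooms can be recovered from the complete frequency table. A sketch, assuming MAD means the median of absolute deviations from the median with no scaling constant (the definition that reproduces the reported value):

```python
from statistics import median

# Full frequency table for bedrooms (value -> count), copied from the report.
bedroom_counts = {1: 48, 2: 580, 3: 1942, 4: 1417, 5: 339,
                  6: 46, 7: 9, 8: 1, 9: 2}

# Expand the table back into one entry per observation.
values = [value for value, count in bedroom_counts.items()
          for _ in range(count)]

med = median(values)
mad = median(abs(v - med) for v in values)

print(len(values), sum(values))
print(med, mad)
```

More than half of the observations lie within one bedroom of the median, so the MAD is 1 even though the maximum is 9.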

sqft_above
Real number (ℝ)

HIGH CORRELATION

Distinct: 490
Distinct (%): 11.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.641866777 × 10^-15
Minimum: -3.727374881
Maximum: 3.332527561
Zeros: 0
Zeros (%): 0.0%
Negative: 2256
Negative (%): 51.5%
Memory size: 34.4 KiB

Quantile statistics

Minimum: -3.727374881
5-th percentile: -1.607801805
Q1: -0.6910411929
Median: -0.04132922053
Q3: 0.7417632297
95-th percentile: 1.638361521
Maximum: 3.332527561
Range: 7.059902442
Interquartile range (IQR): 1.432804423

Descriptive statistics

Standard deviation: 1.000114071
Coefficient of variation (CV): 2.154551431 × 10^14
Kurtosis: -0.4310489928
Mean: 4.641866777 × 10^-15
Median Absolute Deviation (MAD): 0.7145397924
Skewness: 0.01502063272
Sum: 2.052602532 × 10^-11
Variance: 1.000228154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most common values:
Value | Count | Frequency (%)
-0.648923129 | 46 | 1.0%
-0.5079561283 | 44 | 1.0%
-0.4885892409 | 44 | 1.0%
-0.5275090924 | 41 | 0.9%
-0.5873174791 | 41 | 0.9%
-1.14007613 | 41 | 0.9%
-0.8686060571 | 40 | 0.9%
-0.4129199988 | 39 | 0.9%
-0.8002375118 | 39 | 0.9%
-0.6910411929 | 39 | 0.9%
Other values (480) | 3970 | 90.6%

Smallest 10 values:
Value | Count | Frequency (%)
-3.727374881 | 1 | < 0.1%
-3.365516593 | 1 | < 0.1%
-3.048229641 | 1 | < 0.1%
-2.989276226 | 1 | < 0.1%
-2.931677658 | 1 | < 0.1%
-2.875378607 | 1 | < 0.1%
-2.820326953 | 1 | < 0.1%
-2.766473551 | 3 | 0.1%
-2.713772009 | 1 | < 0.1%
-2.662178491 | 3 | 0.1%

Largest 10 values:
Value | Count | Frequency (%)
3.332527561 | 1 | < 0.1%
3.195909343 | 1 | < 0.1%
2.759333801 | 1 | < 0.1%
2.682767393 | 1 | < 0.1%
2.633523084 | 1 | < 0.1%
2.576030954 | 1 | < 0.1%
2.536394404 | 1 | < 0.1%
2.471346726 | 1 | < 0.1%
2.464328599 | 1 | < 0.1%
2.432326295 | 1 | < 0.1%
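
The report does not say how sqft_above was scaled. Its mean of roughly zero and its variance of 1.000228154 (rather than exactly 1) are, however, consistent with a column standardized using the population standard deviation (divisor n) and then summarized with the sample standard deviation (divisor n - 1). The check below illustrates that reading; it is an inference from the numbers, not a documented preprocessing step.

```python
from math import isclose, sqrt

n = 4384  # number of observations

# Standardizing with divisor n gives a population variance of exactly 1.
# Re-measuring with divisor n - 1 rescales that by n / (n - 1).
expected_variance = n / (n - 1)
expected_std = sqrt(expected_variance)

print(f"{expected_variance:.9f} {expected_std:.9f}")
print(isclose(expected_variance, 1.000228154, abs_tol=1e-9))
print(isclose(expected_std, 1.000114071, abs_tol=1e-9))
```

The same standard deviation and variance appear for sqft_living15, price, and sqft_living, so the same reading would apply to those variables.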

sqft_living15
Real number (ℝ)

HIGH CORRELATION

Distinct: 441
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.953053377 × 10^-15
Minimum: -3.362638675
Maximum: 3.0395625
Zeros: 0
Zeros (%): 0.0%
Negative: 2246
Negative (%): 51.2%
Memory size: 34.4 KiB

Quantile statistics

Minimum: -3.362638675
5-th percentile: -1.623301191
Q1: -0.6838419449
Median: -0.04577950829
Q3: 0.7214310906
95-th percentile: 1.678594989
Maximum: 3.0395625
Range: 6.402201175
Interquartile range (IQR): 1.405273036

Descriptive statistics

Standard deviation: 1.000114071
Coefficient of variation (CV): 2.019186943 × 10^14
Kurtosis: -0.3126481893
Mean: 4.953053377 × 10^-15
Median Absolute Deviation (MAD): 0.7048474896
Skewness: 0.00982565812
Sum: 2.173250468 × 10^-11
Variance: 1.000228154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most common values:
Value | Count | Frequency (%)
-0.4527809029 | 44 | 1.0%
-0.5552350382 | 43 | 1.0%
-0.3932329145 | 39 | 0.9%
-0.5973941259 | 39 | 0.9%
-0.2409912255 | 38 | 0.9%
-0.6402560308 | 37 | 0.8%
-0.8191699747 | 37 | 0.8%
-0.9375465243 | 37 | 0.8%
-0.3542993058 | 37 | 0.8%
-0.796121902 | 36 | 0.8%
Other values (431) | 3997 | 91.2%

Smallest 10 values:
Value | Count | Frequency (%)
-3.362638675 | 1 | < 0.1%
-3.199167786 | 1 | < 0.1%
-3.146516512 | 1 | < 0.1%
-3.043810473 | 2 | < 0.1%
-2.848104606 | 2 | < 0.1%
-2.810418974 | 1 | < 0.1%
-2.801069902 | 2 | < 0.1%
-2.754750714 | 3 | 0.1%
-2.70912784 | 1 | < 0.1%
-2.664182814 | 2 | < 0.1%

Largest 10 values:
Value | Count | Frequency (%)
3.0395625 | 2 | < 0.1%
2.85312905 | 1 | < 0.1%
2.749932018 | 1 | < 0.1%
2.731380489 | 1 | < 0.1%
2.660115583 | 2 | < 0.1%
2.580984029 | 2 | < 0.1%
2.55043147 | 1 | < 0.1%
2.5349683 | 1 | < 0.1%
2.514152817 | 2 | < 0.1%
2.493107041 | 1 | < 0.1%

waterfront
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 13152
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 4349 | 99.2%
1.0 | 35 | 0.8%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot; its counts repeat the Common Values table.]

Most occurring characters

Value | Count | Frequency (%)
0 | 8733 | 66.4%
. | 4384 | 33.3%
1 | 35 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8768 | 66.7%
Other Punctuation | 4384 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 8733 | 99.6%
1 | 35 | 0.4%

Other Punctuation
Value | Count | Frequency (%)
. | 4384 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 13152 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 8733 | 66.4%
. | 4384 | 33.3%
1 | 35 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13152 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 8733 | 66.4%
. | 4384 | 33.3%
1 | 35 | 0.3%

floors
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.484260949
Minimum: 1
Maximum: 3.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 2
Maximum: 3.5
Range: 2.5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.5388011141
Coefficient of variation (CV): 0.3630096948
Kurtosis: -0.4517018207
Mean: 1.484260949
Median Absolute Deviation (MAD): 0
Skewness: 0.6502545024
Sum: 6507
Variance: 0.2903066405
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
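
Most common values:
Value | Count | Frequency (%)
1 | 2207 | 50.3%
2 | 1631 | 37.2%
1.5 | 390 | 8.9%
3 | 124 | 2.8%
2.5 | 31 | 0.7%
3.5 | 1 | < 0.1%

Smallest values:
Value | Count | Frequency (%)
1 | 2207 | 50.3%
1.5 | 390 | 8.9%
2 | 1631 | 37.2%
2.5 | 31 | 0.7%
3 | 124 | 2.8%
3.5 | 1 | < 0.1%

Largest values:
Value | Count | Frequency (%)
3.5 | 1 | < 0.1%
3 | 124 | 2.8%
2.5 | 31 | 0.7%
2 | 1631 | 37.2%
1.5 | 390 | 8.9%
1 | 2207 | 50.3%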

sqft_lot
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 13152
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 0.0
3rd row: 1.0
4th row: 2.0
5th row: 4.0

Common Values

Value | Count | Frequency (%)
2.0 | 1967 | 44.9%
1.0 | 1106 | 25.2%
3.0 | 725 | 16.5%
4.0 | 320 | 7.3%
0.0 | 266 | 6.1%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot; its counts repeat the Common Values table.]

Most occurring characters

Value | Count | Frequency (%)
0 | 4650 | 35.4%
. | 4384 | 33.3%
2 | 1967 | 15.0%
1 | 1106 | 8.4%
3 | 725 | 5.5%
4 | 320 | 2.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8768 | 66.7%
Other Punctuation | 4384 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4650 | 53.0%
2 | 1967 | 22.4%
1 | 1106 | 12.6%
3 | 725 | 8.3%
4 | 320 | 3.6%

Other Punctuation
Value | Count | Frequency (%)
. | 4384 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 13152 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4650 | 35.4%
. | 4384 | 33.3%
2 | 1967 | 15.0%
1 | 1106 | 8.4%
3 | 725 | 5.5%
4 | 320 | 2.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13152 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4650 | 35.4%
. | 4384 | 33.3%
2 | 1967 | 15.0%
1 | 1106 | 8.4%
3 | 725 | 5.5%
4 | 320 | 2.4%

price
Real number (ℝ)

HIGH CORRELATION

Distinct: 1550
Distinct (%): 35.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.519498234 × 10^-13
Minimum: -4.669290356
Maximum: 2.901084695
Zeros: 0
Zeros (%): 0.0%
Negative: 2052
Negative (%): 46.8%
Memory size: 34.4 KiB

Quantile statistics

Minimum: -4.669290356
5-th percentile: -1.626419319
Q1: -0.6054377704
Median: 0.05071254471
Q3: 0.6217544233
95-th percentile: 1.37058292
Maximum: 2.901084695
Range: 7.570375051
Interquartile range (IQR): 1.227192194

Descriptive statistics

Standard deviation: 1.000114071
Coefficient of variation (CV): 6.581870571 × 10^12
Kurtosis: 1.346485739
Mean: 1.519498234 × 10^-13
Median Absolute Deviation (MAD): 0.6078048319
Skewness: 0.02751371968
Sum: 6.662377317 × 10^-10
Variance: 1.000228154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most common values:
Value | Count | Frequency (%)
-0.4194142735 | 43 | 1.0%
0.05071254471 | 41 | 0.9%
-0.283154233 | 36 | 0.8%
-0.5722534461 | 35 | 0.8%
0.3056373027 | 34 | 0.8%
-0.1607388537 | 31 | 0.7%
0.2275078256 | 30 | 0.7%
-1.171905929 | 30 | 0.7%
0.01152250914 | 29 | 0.7%
0.1428447777 | 28 | 0.6%
Other values (1540) | 4047 | 92.3%

Smallest 10 values:
Value | Count | Frequency (%)
-4.669290356 | 1 | < 0.1%
-3.971919452 | 1 | < 0.1%
-3.952250909 | 1 | < 0.1%
-3.800094941 | 2 | < 0.1%
-3.752541505 | 1 | < 0.1%
-3.656568595 | 1 | < 0.1%
-3.621943352 | 1 | < 0.1%
-3.39868964 | 1 | < 0.1%
-3.392426911 | 1 | < 0.1%
-3.361358723 | 1 | < 0.1%

Largest 10 values:
Value | Count | Frequency (%)
2.901084695 | 1 | < 0.1%
2.898108224 | 1 | < 0.1%
2.898078954 | 1 | < 0.1%
2.898056904 | 1 | < 0.1%
2.897777884 | 1 | < 0.1%
2.897609474 | 1 | < 0.1%
2.897415704 | 1 | < 0.1%
2.897082667 | 1 | < 0.1%
2.89665688 | 1 | < 0.1%
2.896333475 | 1 | < 0.1%
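
The very large coefficient of variation reported for price follows from dividing the standard deviation by a mean that is numerically almost zero. A short check using the reported values (variable names are illustrative):

```python
from math import isclose

# Mean and standard deviation copied from the price statistics.
mean = 1.519498234e-13
std = 1.000114071

# CV = std / mean; a mean this close to zero inflates the ratio.
cv = std / mean

print(f"{cv:.6e}")
print(isclose(cv, 6.581870571e12, rel_tol=1e-6))
```

For a variable centred on zero the CV mostly reflects floating-point residue in the mean, so it carries little information here.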

condition
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 13152
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3.0
2nd row: 3.0
3rd row: 3.0
4th row: 3.0
5th row: 3.0

Common Values

Value | Count | Frequency (%)
3.0 | 2884 | 65.8%
4.0 | 1112 | 25.4%
5.0 | 346 | 7.9%
2.0 | 35 | 0.8%
1.0 | 7 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot; its counts repeat the Common Values table.]

Most occurring characters

Value | Count | Frequency (%)
. | 4384 | 33.3%
0 | 4384 | 33.3%
3 | 2884 | 21.9%
4 | 1112 | 8.5%
5 | 346 | 2.6%
2 | 35 | 0.3%
1 | 7 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8768 | 66.7%
Other Punctuation | 4384 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4384 | 50.0%
3 | 2884 | 32.9%
4 | 1112 | 12.7%
5 | 346 | 3.9%
2 | 35 | 0.4%
1 | 7 | 0.1%

Other Punctuation
Value | Count | Frequency (%)
. | 4384 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 13152 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 4384 | 33.3%
0 | 4384 | 33.3%
3 | 2884 | 21.9%
4 | 1112 | 8.5%
5 | 346 | 2.6%
2 | 35 | 0.3%
1 | 7 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13152 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 4384 | 33.3%
0 | 4384 | 33.3%
3 | 2884 | 21.9%
4 | 1112 | 8.5%
5 | 346 | 2.6%
2 | 35 | 0.3%
1 | 7 | 0.1%

sqft_lot15
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 13152
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2.0
2nd row: 1.0
3rd row: 1.0
4th row: 2.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
2.0 | 1890 | 43.1%
1.0 | 1295 | 29.5%
3.0 | 658 | 15.0%
4.0 | 294 | 6.7%
0.0 | 247 | 5.6%

Length

[Figure: Histogram of lengths of the category]

Category Frequency Plot

[Figure: category frequency plot; its counts repeat the Common Values table.]

Most occurring characters

Value | Count | Frequency (%)
0 | 4631 | 35.2%
. | 4384 | 33.3%
2 | 1890 | 14.4%
1 | 1295 | 9.8%
3 | 658 | 5.0%
4 | 294 | 2.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8768 | 66.7%
Other Punctuation | 4384 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4631 | 52.8%
2 | 1890 | 21.6%
1 | 1295 | 14.8%
3 | 658 | 7.5%
4 | 294 | 3.4%

Other Punctuation
Value | Count | Frequency (%)
. | 4384 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 13152 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4631 | 35.2%
. | 4384 | 33.3%
2 | 1890 | 14.4%
1 | 1295 | 9.8%
3 | 658 | 5.0%
4 | 294 | 2.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13152 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4631 | 35.2%
. | 4384 | 33.3%
2 | 1890 | 14.4%
1 | 1295 | 9.8%
3 | 658 | 5.0%
4 | 294 | 2.2%

sqft_living
Real number (ℝ)

HIGH CORRELATION

Distinct: 539
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.306357621 × 10^-15
Minimum: -3.467099061
Maximum: 4.560854687
Zeros: 0
Zeros (%): 0.0%
Negative: 2157
Negative (%): 49.2%
Memory size: 34.4 KiB

Quantile statistics

Minimum: -3.467099061
5-th percentile: -1.644378586
Q1: -0.6860623492
Median: 0.0006779349509
Q3: 0.7012017549
95-th percentile: 1.636738313
Maximum: 4.560854687
Range: 8.027953748
Interquartile range (IQR): 1.387264104

Descriptive statistics

Standard deviation: 1.000114071
Coefficient of variation (CV): 3.024821224 × 10^14
Kurtosis: -0.1788791482
Mean: 3.306357621 × 10^-15
Median Absolute Deviation (MAD): 0.695762459
Skewness: -0.001090873618
Sum: 1.451372356 × 10^-11
Variance: 1.000228154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most common values:
Value | Count | Frequency (%)
0.0006779349509 | 37 | 0.8%
-0.1013106765 | 36 | 0.8%
-0.6532734636 | 35 | 0.8%
-0.3188469686 | 34 | 0.8%
-0.928786359 | 33 | 0.8%
-0.8925991982 | 33 | 0.8%
-0.4956003819 | 31 | 0.7%
-0.2076844466 | 30 | 0.7%
-0.2350091553 | 30 | 0.7%
-0.6696130404 | 29 | 0.7%
Other values (529) | 4056 | 92.5%

Smallest 10 values:
Value | Count | Frequency (%)
-3.467099061 | 1 | < 0.1%
-3.216474217 | 1 | < 0.1%
-2.990143759 | 1 | < 0.1%
-2.947386253 | 1 | < 0.1%
-2.905393143 | 1 | < 0.1%
-2.82359188 | 1 | < 0.1%
-2.783733159 | 1 | < 0.1%
-2.705982749 | 2 | < 0.1%
-2.668047835 | 2 | < 0.1%
-2.557766551 | 1 | < 0.1%

Largest 10 values:
Value | Count | Frequency (%)
4.560854687 | 1 | < 0.1%
3.480484386 | 1 | < 0.1%
3.425531017 | 1 | < 0.1%
3.200574403 | 1 | < 0.1%
3.000974718 | 1 | < 0.1%
2.997130821 | 1 | < 0.1%
2.879099132 | 1 | < 0.1%
2.793176562 | 1 | < 0.1%
2.738528811 | 1 | < 0.1%
2.725753 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots; the images are not available in this text version.]
2022-09-22T22:35:36.948735image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:38.503525image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:40.064391image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:41.906256image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:43.829143image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:26.543827image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:28.143901image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:29.627042image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:31.519604image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:33.266118image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:35.209418image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:37.079744image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:38.631535image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:40.198400image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:42.083269image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:43.985633image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:26.730837image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:28.278915image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:29.761904image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:31.667614image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:33.431130image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:35.430432image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:37.228231image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:38.779060image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:40.347415image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/
2022-09-22T22:35:42.248281image/svg+xmlMatplotlib v3.5.3, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
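
The calculation described above can be sketched in plain Python. This is a minimal illustration on toy data, not the code the report itself runs; the helper names `average_ranks` and `spearman_rho` are hypothetical, and tied values are assumed to receive the mean of their rank positions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for a constant variable")
    # The 1/n factors of covariance and variances cancel, so they are omitted.
    return cov / sqrt(var_x * var_y)


# Toy data: y is a nonlinear but strictly increasing function of x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because only the ranks enter the calculation, any strictly increasing transformation of either variable leaves ρ unchanged.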

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the direction of the slope determines its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
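
As a rough sketch of this formula, the snippet below computes r in plain Python on toy data. The function name `pearson_r` and the sample values are illustrative assumptions, not part of the report.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / (std_x * std_y)


# Toy data: an exact linear relationship, then the same data shifted and rescaled.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [10.0 + 3.0 * v for v in x]
y_rescaled = [-2.0 + 0.5 * v for v in y_linear]

r_linear = pearson_r(x, y_linear)
r_rescaled = pearson_r(x, y_rescaled)
```

Both calls give the same coefficient, which illustrates the invariance under changes of location and (positive) scale.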

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
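
The pair-counting definition can be illustrated with a short Python sketch. It implements the plain form given in the text (often called tau-a), in which tied pairs count as neither concordant nor discordant; tie-corrected variants such as tau-b can give different values. The function name `kendall_tau_a` and the data are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # product == 0 means a tie in x or y; it counts as neither
    return (concordant - discordant) / pairs


# Toy data: one swapped pair (positions 2 and 3) out of six pairs in total.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.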

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
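
For illustration, the sketch below computes a bias-corrected Cramér's V in plain Python, using the correction commonly attributed to Bergsma (2013): φ² is reduced by (k-1)(r-1)/(n-1) and the table dimensions r and k are shrunk accordingly. The function name `cramers_v_corrected` and the toy data are assumptions, and this is not necessarily the exact code path used to produce the report.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of category labels."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single category carries no association
    return sqrt(phi2_corr / denom)


# Toy data: b is fully determined by a, then a pair of independent variables.
v_perfect = cramers_v_corrected(["x", "y"] * 10, ["p", "q"] * 10)
v_independent = cramers_v_corrected(["x", "x", "y", "y"] * 5, ["p", "q", "p", "q"] * 5)
```

The first call returns 1 (perfect association) and the second returns 0 (independence), matching the two ends of the range described above.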

Phik (φk)

Phik (φk) is a more recent, practical correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available for the phik package.

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Dendrogram: the dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
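
The per-column nullity that the count plot summarizes can be tallied with a few lines of Python. This is only a sketch: the helper `nullity_by_column` is hypothetical, and the four rows are a small excerpt built from the grade and sqft_basement values of rows 0, 4, 5 and 6 in the sample below, not the full dataset.

```python
from math import isnan


def nullity_by_column(rows, columns):
    """Return {column: (missing count, missing percentage)} for a list of row dicts."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and isnan(value))

    total = len(rows)
    summary = {}
    for col in columns:
        missing = sum(1 for row in rows if is_missing(row.get(col)))
        percent = 100.0 * missing / total if total else 0.0
        summary[col] = (missing, percent)
    return summary


# Excerpt of two columns from four sample rows; NaN marks a missing cell.
rows = [
    {"grade": 8.0, "sqft_basement": float("nan")},
    {"grade": 7.0, "sqft_basement": 240.0},
    {"grade": 9.0, "sqft_basement": float("nan")},
    {"grade": 7.0, "sqft_basement": 1100.0},
]
summary = nullity_by_column(rows, ["grade", "sqft_basement"])
```

Applied to the full dataset, the same tally would correspond to the alert in the overview that sqft_basement has 2668 missing values out of 4384 observations (60.9%).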

Sample

First rows

      level_0df_index  grade  sqft_basement  view  bathrooms  bedrooms  sqft_above  sqft_living15  waterfront  floors  sqft_lot      price  condition  sqft_lot15  sqft_living
0           328286159    8.0            NaN   0.0       2.25       3.0    0.270630       0.347140         0.0     1.0       2.0  -0.745215        3.0         2.0    -0.153925
1           601217829    8.0            NaN   0.0       2.50       3.0   -0.394437      -1.036220         0.0     3.0       0.0   0.149879        3.0         1.0    -0.804412
2           349205581    8.0            NaN   0.0       3.00       4.0    0.987378       0.841629         0.0     2.0       1.0   0.227508        3.0         1.0     0.607272
3              923338    7.0            NaN   0.0       1.75       3.0   -0.269616      -0.204335         0.0     1.0       2.0  -1.030388        3.0         2.0    -0.686062
4           985913944    7.0          240.0   0.0       1.00       2.0   -0.734034      -0.705913         0.0     1.0       4.0  -0.363089        3.0         0.0    -0.686062
5          1348810035    9.0            NaN   0.0       2.50       4.0    0.658856       0.487815         0.0     2.0       2.0   0.701140        3.0         2.0     0.250012
6          1597321915    7.0         1100.0   0.0       1.75       4.0   -0.915452      -0.472945         0.0     1.0       2.0  -0.283154        5.0         1.0     0.349666
7          2426511524   10.0            NaN   0.0       3.00       3.0    1.387071       1.043055         0.0     2.0       4.0   0.963712        3.0         3.0     1.062741
8            29055135    7.0            NaN   0.0       1.75       3.0   -0.450401      -0.576228         0.0     1.0       3.0  -1.776634        4.0         3.0    -0.856941
9            25507404   11.0            NaN   0.0       3.50       4.0    2.117412       0.721431         0.0     2.0       2.0   1.584642        3.0         2.0     1.961547

Last rows

      level_0df_index  grade  sqft_basement  view  bathrooms  bedrooms  sqft_above  sqft_living15  waterfront  floors  sqft_lot      price  condition  sqft_lot15  sqft_living
4374       2668421235    8.0            NaN   0.0       2.75       4.0    1.621179       0.432551         0.0     2.0       2.0   0.683450        3.0         2.0     1.340982
4375        486308832    7.0            NaN   0.0       2.00       4.0    0.648220      -0.240991         0.0     1.0       3.0   0.227508        4.0         2.0     0.238689
4376        599668352    7.0          800.0   0.0       1.00       3.0   -0.041329      -0.297005         0.0     1.0       1.0   0.326452        3.0         1.0     0.517561
4377         38623264   11.0            NaN   0.0       2.75       4.0    1.546773       2.183899         0.0     2.0       3.0   2.890096        3.0         3.0     1.251589
4378         93439394    8.0            NaN   0.0       2.50       3.0    0.605043      -1.327956         0.0     2.0       2.0   0.951100        3.0         2.0     0.192875
4379       1437511832    7.0            NaN   0.0       2.00       3.0   -0.648923      -0.534413         0.0     1.0       2.0  -1.172895        3.0         2.0    -1.040688
4380       2977511830    7.0          800.0   0.0       2.00       3.0   -0.822780       0.135037         0.0     1.0       3.0  -0.165398        3.0         3.0     0.050140
4381        580113137    7.0         1220.0   0.0       1.75       3.0   -0.322161       0.432551         0.0     1.0       4.0  -0.267780        3.0         4.0     0.759237
4382        166102584    7.0            NaN   0.0       1.50       3.0   -0.412920      -0.432776         0.0     1.0       2.0  -0.222811        3.0         2.0    -0.821796
4383          2359683    6.0            NaN   0.0       1.00       2.0   -1.420574      -0.728174         0.0     1.0       1.0   0.050713        4.0         1.0    -1.719006